An Efficient Sequential Covering Algorithm for Explaining Subsets of Data
Abstract
Given a subset of data that differs from the rest, a user often wants an explanation of why this is the case. For instance, in a database of flights, a user may want to understand why certain flights were very late. This paper presents ESCAPE, a sequential covering algorithm designed to generate explanations of subsets. The explanations take the form of disjunctive normal rules describing the characteristics ({attribute, value} pairs) that differentiate the subset from the rest of the data. Our experiments demonstrate that ESCAPE discovers explanations that are both compact, in that just a few rules cover the subset, and specific, in that the rules cover the subset but not the rest of the data. Our experiments compare ESCAPE to RIPPER, a popular traditional rule learning algorithm, and show that ESCAPE’s rules yield better covering explanations. Further, ESCAPE was designed to be efficient, and we formally demonstrate that ESCAPE runs in log-linear time.
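To make the idea of sequential covering concrete, the following is a minimal sketch of a generic greedy covering loop, not ESCAPE itself: the abstract does not give ESCAPE's scoring function, data structures, or the steps behind its log-linear bound, and this naive version makes no attempt to match that running time. The names `learn_one_rule`, `sequential_cover`, the precision-based score, and the toy flight table are all assumptions made for illustration. Each rule is a conjunction of {attribute, value} pairs, and the returned list of rules is read as their disjunction.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, str]
Condition = Tuple[str, str]   # one {attribute, value} pair
Rule = List[Condition]        # conjunction of conditions


def matches(rule: Rule, row: Row) -> bool:
    """A row satisfies a rule when every condition in it holds."""
    return all(row.get(attr) == val for attr, val in rule)


def learn_one_rule(rows: List[Row], targets: Set[int],
                   uncovered: Set[int], max_conditions: int = 3) -> Rule:
    """Greedily add conditions until no row outside the subset is matched.

    Hypothetical scoring: prefer higher precision, then higher coverage.
    """
    rule: Rule = []
    pos = set(uncovered)                                      # subset rows still to explain
    neg = {i for i in range(len(rows)) if i not in targets}   # rest of the data
    while neg and len(rule) < max_conditions:
        used = {attr for attr, _ in rule}
        candidates = {(a, v) for i in pos for a, v in rows[i].items()
                      if a not in used}
        if not candidates:
            break

        def score(cond: Condition) -> Tuple[float, int]:
            p = sum(1 for i in pos if rows[i].get(cond[0]) == cond[1])
            n = sum(1 for i in neg if rows[i].get(cond[0]) == cond[1])
            return (p / (p + n), p)   # p >= 1: candidates come from pos rows

        best = max(sorted(candidates), key=score)
        rule.append(best)
        pos = {i for i in pos if rows[i].get(best[0]) == best[1]}
        neg = {i for i in neg if rows[i].get(best[0]) == best[1]}
    return rule


def sequential_cover(rows: List[Row], targets: Set[int]) -> List[Rule]:
    """Learn rules one at a time, removing the subset rows each rule covers."""
    uncovered = set(targets)
    rules: List[Rule] = []
    while uncovered:
        rule = learn_one_rule(rows, targets, uncovered)
        covered = {i for i in uncovered if matches(rule, rows[i])}
        if not covered:
            break
        rules.append(rule)
        uncovered -= covered
    return rules


# Toy data (invented for illustration): rows 0-2 are the "very late" flights.
flights = [
    {"airline": "A", "weather": "snow"},
    {"airline": "B", "weather": "snow"},
    {"airline": "C", "weather": "clear"},
    {"airline": "A", "weather": "clear"},
    {"airline": "B", "weather": "clear"},
]
late = {0, 1, 2}
explanation = sequential_cover(flights, late)
print(explanation)  # [[('weather', 'snow')], [('airline', 'C')]]
```

On this toy table the result reads as "weather = snow OR airline = C": two rules cover all three late flights and none of the on-time ones, which is the compact-and-specific property the abstract describes.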
Related resources
A TRUST-REGION SEQUENTIAL QUADRATIC PROGRAMMING WITH NEW SIMPLE FILTER AS AN EFFICIENT AND ROBUST FIRST-ORDER RELIABILITY METHOD
The real-world applications addressing the nonlinear functions of multiple variables could be implicitly assessed through structural reliability analysis. This study establishes an efficient algorithm for resolving highly nonlinear structural reliability problems. To this end, first a numerical nonlinear optimization algorithm with a new simple filter is defined to locate and estimate the most ...
Well-dispersed subsets of non-dominated solutions for MOMILP problem
This paper uses the weighted $L_1$-norm to propose an algorithm for finding a well-dispersed subset of non-dominated solutions of multiple objective mixed integer linear programming problem. When all variables are integer it finds the whole set of efficient solutions. In each iteration of the proposed method only a mixed integer linear programming problem is solved and its optimal solutions gen...
A Reliable Multi-objective p-hub Covering Location Problem Considering of Hubs Capabilities
In the facility location problem, reducing total transfer cost and time are common objectives. Designing a network with hub facilities can improve network efficiency. In this study, a new model is presented for the p-hub covering location problem. The p-hub covering problem attempts to locate hubs and allocate customers to established hubs while allocated nodes to hubs are in...
An L1-norm method for generating all of efficient solutions of multi-objective integer linear programming problem
This paper extends the method proposed by Jahanshahloo et al. (2004) (a method for generating all the efficient solutions of a 0–1 multi-objective linear programming problem, Asia-Pacific Journal of Operational Research). This paper considers the recession direction for a multi-objective integer linear programming (MOILP) problem and presents necessary and sufficient conditions to have unbounde...
Efficient Solution Procedure to Develop Maximal Covering Location Problem Under Uncertainty (Using GA and Simulation)
In this paper, we present the stochastic version of the Maximal Covering Location Problem, which optimizes both location and allocation decisions concurrently. It is assumed that the traveling time between customers and distribution centers (DCs) is uncertain and described by a normal distribution function, and if this time is less than the coverage time, the customer can be allocated to the DC. In classical mod...